The Internal Data Format for ctmc_fit

The function ctmc_fit expects the data to be structured as follows:



In [1]:

    
data = [([0, 1, 2, 1], [2.2, 3.35, 9.4, 1.3]), 
        ([1, 0, 1], [4.0, 1.25, 1.7])]

Each example (or event chain) is one element of the list data.

The first entry of an example is a list of states;
the second entry is a list of the time periods each of those states lasted.
Both lists have the same length, and the i-th time period belongs to the i-th state.
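The format described above can be checked before fitting. The following is only a minimal sketch: `check_ctmc_data` is a hypothetical helper written for illustration, not a function of the package, and it assumes that states are integer labels from 0 to numstates - 1 and that time periods are non-negative.

```python
def check_ctmc_data(data, numstates):
    """Hypothetical helper (not part of ctmc_fit): validate the data layout."""
    for k, (states, times) in enumerate(data):
        # every state needs exactly one time period
        if len(states) != len(times):
            raise ValueError(
                f"example {k}: {len(states)} states but {len(times)} time periods")
        # assumption: states are integer labels 0 .. numstates-1
        if any(not (0 <= s < numstates) for s in states):
            raise ValueError(f"example {k}: state label out of range")
        # assumption: a time period cannot be negative
        if any(t < 0 for t in times):
            raise ValueError(f"example {k}: negative time period")
    return True


data = [([0, 1, 2, 1], [2.2, 3.35, 9.4, 1.3]),
        ([1, 0, 1], [4.0, 1.25, 1.7])]
ok = check_ctmc_data(data, numstates=3)
```

Raising on the first problem keeps the sketch short; collecting all problems into a list would be an equally reasonable design.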

How does it work in ctmc_fit?

Initialize variables



In [7]:

    
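import numpy as np
numstates = 3  # number of distinct states (labelled 0, 1, 2)
# statetime[s] accumulates the total time spent in state s
statetime = np.zeros(numstates, dtype=float)
# transcount[i, j] counts the transitions from state i to state j
transcount = np.zeros(shape=(numstates, numstates), dtype=int)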

Loop over all examples, accumulating the time spent in each state and counting the transitions between states.



In [12]:

    
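for states, times in data:
    for i, (s, t) in enumerate(zip(states, times)):
        # add the time period to the total for state s
        statetime[s] += t
        # from the second state onward, count the transition
        # from the previous state into s
        if i > 0:
            transcount[states[i-1], s] += 1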
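The same accumulation can also be sketched without the inner Python loop, using `np.add.at`. This is an alternative formulation for illustration, not necessarily what ctmc_fit does internally; the names `statetime_vec` and `transcount_vec` are introduced here only to avoid clobbering the arrays above.

```python
import numpy as np

data = [([0, 1, 2, 1], [2.2, 3.35, 9.4, 1.3]),
        ([1, 0, 1], [4.0, 1.25, 1.7])]
numstates = 3

# fresh accumulators, so this is exactly one pass over data
statetime_vec = np.zeros(numstates, dtype=float)
transcount_vec = np.zeros((numstates, numstates), dtype=int)

for states, times in data:
    states = np.asarray(states)
    # np.add.at accumulates correctly even when a state index repeats
    np.add.at(statetime_vec, states, times)
    # pair every state with its successor: (from, to)
    np.add.at(transcount_vec, (states[:-1], states[1:]), 1)
```

Plain fancy-indexed `+=` would silently drop repeated indices within one example, which is why `np.add.at` is used here.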

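The intermediate results are shown below. The printed values are exactly three times what a single pass over data yields (statetime [3.45, 10.35, 9.4]; transcount rows [0, 2, 0], [1, 0, 1], [0, 1, 0]), which suggests the accumulation cell was executed three times without re-initializing the arrays.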



In [13]:

    
statetime

    Out[13]:

array([10.35, 31.05, 28.2 ])



In [14]:

    
transcount

    Out[14]:

array([[0, 6, 0],
       [3, 0, 3],
       [0, 3, 0]])
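To indicate what these two arrays are typically used for, here is a sketch of the standard maximum-likelihood estimate of a CTMC generator matrix: the rate from i to j is the number of i -> j transitions divided by the total time spent in i. Whether ctmc_fit computes exactly this is an assumption, since it is not shown in this notebook; `genmat` is a name chosen for illustration. The arrays used are the single-pass totals for the example data; the printed arrays above are a common multiple of them, which leaves the ratio unchanged.

```python
import numpy as np

# single-pass totals for the example data
statetime = np.array([3.45, 10.35, 9.4])
transcount = np.array([[0, 2, 0],
                       [1, 0, 1],
                       [0, 1, 0]])

# off-diagonal rates: transitions i -> j per unit of time spent in i
genmat = transcount / statetime[:, np.newaxis]
# diagonal: negative total exit rate, so that every row sums to zero
np.fill_diagonal(genmat, 0.0)
np.fill_diagonal(genmat, -genmat.sum(axis=1))
```

A state with zero accumulated time would cause a division by zero here, so real data would need a guard for unvisited states.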



In [6]:

    
# Optional sparse alternative for transcount (left disabled here):
#from scipy.sparse import lil_matrix
#transcount = lil_matrix((numstates, numstates), dtype=int)
#transcount.toarray()